Extracting hidden weak sinusoidal signal with short duration from noisy data: Analytical theory and computational realization
Zhang Ying1, Zhang Zhaoyang1, 2, Qian Hong3, Hu Gang1, †
Department of Physics, Beijing Normal University, Beijing 100875, China
Faculty of Science, Ningbo University, Ningbo 315211, China
Department of Applied Mathematics, University of Washington, Seattle, WA 98195, USA

 

† Corresponding author. E-mail: ganghu@bnu.edu.cn

Abstract

Signal detection is both a fundamental topic of data science and a great challenge for practical engineering. One of the canonical and widely investigated tasks is detecting a sinusoidal signal of known frequency ω and duration T embedded in stationary noisy data. The most direct method, also believed to be the most efficient, is to compute the Fourier spectral power B at ω. Whether any nonlinear processing can outperform the linear Fourier approach has attracted great interest, but no consensus has been reached so far, and no rigorous analytic theory has been offered. We revisit the problem of weak signal, strong noise, and finite data length T = O(1), and propose a signal detection method based on resonant filtering. We show that the linear approach of resonant filters yields the same signal detection efficiency in the limit T → ∞. For finite time length T = O(1), however, our method can improve the signal detection, owing to the highly nonlinear interactions between various characteristics of a resonant filter during its transient evolution in finite time. At the optimal match between the input I(t), the control parameters, and the initial preparation of the filter state, its performance considerably exceeds the above threshold B. Our results are based on a rigorous analysis of Gaussian processes, and the conclusions are supported by numerical computations.

1. Introduction

Signal detection is a crucial task in various fields.[1–3] In many practical situations, signals are very weak and short-lived, and they are often contaminated by strong noise. How to extract a weak signal of short duration from a strong noisy background is a significant issue in both theory and practice.[4–8] We revisit the simplest and most basic signal-detection problem: a section of sinusoidal signal of duration T (the green curve in the inset of Fig. 1(a)) is embedded in a strong white-noise background, as shown in Fig. 1(a). The window function that switches the signal on is 1 within the signal interval and zero otherwise. Throughout the paper, ω0 and T are assumed to be known, because in many practical cases the signals are emitted by the receivers themselves. Also, we take and D = 0.5. Owing to the extremely small signal-to-noise ratio, proportional to (), one can hardly find any trace of the signal in the whole input, shown as the blue curve in Fig. 1(b). A direct way to detect a sinusoidal signal is to perform Fourier analysis and test different τ to find the time period containing the signal . We thus have

which is expected to best distinguish the signal from the noise background when the τ used in Eq. (1) exactly matches the signal segment. In Fig. 1(c) we plot the quantity computed by Eq. (1) versus τ for the input I(t) in Fig. 1(b); no detectable feature corresponding to the signal in Fig. 1(a) can be seen.

Fig. 1. (color online) Detecting a weak and short-lived signal in a strong noise background. (a) Pure noise time sequence , D = 0.5 (red curve), and signal in the time interval ( to ) with , ω = 0.02, A = 0.006 (green curve). (b) Time sequence of weak signal contaminated with strong noise with and given in panel (a), from which one can hardly find any trace of . (c) computed by Eq. (1) for A = 0.006 with finite time length . Note that the signal exists in the time interval , which cannot be observed in the plots. (d) The same as panel (c) with a longer signal interval, . during the signal interval is considerably above that at other times. Now the signal in the interval with can be clearly extracted.

For many years, scientists have tried to answer some fundamental and practical questions related to the signal detection of Fig. 1, such as (i) whether the signal-to-noise ratio (SNR) can be enhanced beyond that of the input B by any designed means, and (ii) if so, by how much. Many methods have been suggested in previous investigations,[9–12] among which stochastic resonance (SR) has been regarded as the most promising.[9,10] However, despite more than two decades of effort, no satisfactory results have been obtained. First, owing to the nonlinearity of SR systems, no exact analysis is possible and, in particular, no convincing answer to the first question based on rigorous mathematical–physical computations has been given. Second, although some numerical analyses show an SNR of the SR output exceeding B of the input, the enhancement is possible only for inputs with rather large SNR, for which the signal can be easily detected even without SR treatment.[13–15] The practically meaningful problem of whether and how detection can be enhanced beyond the threshold of the input B for weak signals contaminated by strong noise remains unanswered. Third, all previous investigations have focused on infinitely long time series. The practically significant problem of how to enhance the detection efficiency for a short-lived signal in a strong noise background has seldom been considered.

The central tasks of the present paper are the following: (i) to consider the problem of extracting a weak signal from a strong noise background; (ii) to detect signals with short lifetimes; (iii) most importantly, to apply rigorous computation to answer analytically whether, and by how much, the SNR of the output can be enhanced over that of the input.

Scientists have long been familiar with the use of linear resonant filters to amplify input signals of a certain frequency. It is commonly believed that linear filters can never raise the SNR of the output above that of the input: owing to linearity, a filter amplifies the signal and the background noise around its resonant frequency equally, so the ratio SNRout/SNRin can never exceed 1. This understanding is valid and exactly proven for any sinusoidal signal of infinite length. It has not been recognized so far that this seemingly solid conclusion may not be valid in the transient processes of linear filters applied to short-lived signal detection. In this work we find that, for finite-time inputs, the noisy signal interacts with the system parameters and the initially prepared state in a highly nonlinear manner, and this nonlinearity makes it possible to enhance the output SNR beyond the input SNR.

This paper is organized as follows. In Section 2, we describe our linear filter system, define the SNRs of both input and output for finite signal lifetime, and compute the SNR of the input analytically. In Section 3, the output SNR is also derived analytically, and SNRout/SNRin > 1 in certain parameter regimes is rigorously proven by theoretical computations. Numerical simulations confirm all theoretical results and the conclusion that, for an input with a weak, short-lived signal and strong noise, linear filters can indeed considerably enhance our ability to extract the signal under a suitable match among the input, the filter control parameters, and the initial state of the filter.

2. Extracting weak signals of short duration from strong white noise

Consider the simplest and most fundamental problem of extracting a sinusoidal signal from noise, for a time sequence

where D, A, ω (also ), and T are all assumed to be known. Consider a task in which an active detector emits a sinusoidal signal with a known frequency ω and duration T, and the signal is reflected (or not reflected) back from a certain object in some unknown time interval. The amplitude A of the reflected signal can be well estimated for the detection task, and the noise intensity D can be easily computed from the received data. The central problem then reduces to answering whether the sinusoidal signal with known amplitude A exists in a given segment of received data I(t) (Eq. (2)). The most direct method to extract the signal[10] is to compute the Fourier spectrum at frequency ω
For T → ∞, we have , and the existence of the signal can be decided definitely from the computed B value (see Fig. 1(d) for large but still finite T). For small T, such as , is no longer valid (see Fig. 1(c)). Nevertheless, from a given value of the computed B one can still state the probability with which I(t) includes the given signal. In the present work, all of our results rely on systematic probability computation, which is important in realistic detection of short-duration signals. To the best of our knowledge, it has seldom been used so far, because in previous works signal detection was usually performed for infinitely long signal lifetimes.
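As an illustration of the Fourier estimate described above, the following Python sketch computes two quadrature components and their amplitude B for a sampled input. The 2/T normalization, the noise convention ⟨Γ(t)Γ(t′)⟩ = 2Dδ(t − t′), the zero default phase, and the function names are assumptions made for illustration; they are not taken from Eq. (3).

```python
import numpy as np

def fourier_amplitude(signal, dt, omega):
    """Return (B1, B2, B) for samples of I(t) on a uniform grid of step dt."""
    t = np.arange(signal.size) * dt
    T = signal.size * dt
    # Quadrature components. The 2/T normalization is an assumption, chosen
    # so that a pure sinusoid of amplitude A over whole periods gives B ~ A.
    b1 = (2.0 / T) * np.sum(signal * np.cos(omega * t)) * dt
    b2 = (2.0 / T) * np.sum(signal * np.sin(omega * t)) * dt
    return b1, b2, np.hypot(b1, b2)

def noisy_input(A, omega, D, n_periods, dt, rng, phase=0.0):
    """Sample I(t) = A sin(omega t + phase) + white noise over n_periods periods."""
    T0 = 2.0 * np.pi / omega
    n = int(round(n_periods * T0 / dt))
    t = np.arange(n) * dt
    # Discretized white noise with <G(t)G(t')> = 2 D delta(t - t') (assumed
    # convention): each sample then has variance 2 D / dt.
    noise = rng.normal(0.0, np.sqrt(2.0 * D / dt), size=n)
    return A * np.sin(omega * t + phase) + noise

rng = np.random.default_rng(0)
I_t = noisy_input(A=0.006, omega=0.02, D=0.5, n_periods=10, dt=1.0, rng=rng)
b1, b2, B = fourier_amplitude(I_t, dt=1.0, omega=0.02)
```

With these conventions the noise contribution to B scales as the square root of D/T, which is why a single computed B is only probabilistically informative when T is short.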

Because any linear combination of Gaussian variables is again Gaussian, the (B1, B2) distribution can generally be written as a two-dimensional joint Gaussian distribution

Here and are the mean values of B1 and B2, σ1 and σ2 are the corresponding variances of B1 and B2, σ12 is the covariance of B1 and B2, and . By a coordinate transformation from Cartesian (B1, B2) to polar coordinates, the distribution of the amplitude of the composition can be written as

For with variables and being given in Eq. (2), the distribution of B can be specified as the following expression

Without signal (A = 0), the formula simplifies further to

With the given parameters A, T, and D, the computed B can appear anywhere in the range (0, ∞) with different probabilities. For T → ∞, equation (6) reduces to the deterministic result: a sharp Gaussian probability distribution centered at . The problem of detecting whether a signal exists in a piece of received data I(t) of length T can now be answered probabilistically by comparing the probability distributions of the two cases: Eq. (7) (no signal) and Eq. (6) (with signal).

In Figs. 2(a) and 2(b) we fix two different finite values of T and plot, without signal (red curves) and with signal (blue curves), the theoretical results (continuous curves) and numerical computations (dots). The two sets of results agree with each other perfectly.
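The shape of such amplitude distributions can be sketched under explicit assumptions. Suppose that T is a whole number of periods, that B1 and B2 carry a 2/T normalization, and that the noise obeys ⟨Γ(t)Γ(t′)⟩ = 2Dδ(t − t′). Then the two quadratures are independent Gaussians with common variance 4D/T, and their amplitude follows a Rice density, which becomes a Rayleigh density for A = 0. These conventions and the function name are illustrative assumptions and need not coincide with the normalization of Eqs. (6) and (7).

```python
import numpy as np

def amplitude_pdf(b, A, D, T):
    """Rice density of the amplitude B (Rayleigh when A = 0).

    Assumes independent quadratures with mean amplitude A and common
    variance 4 D / T; this normalization is an assumption of the sketch.
    """
    var = 4.0 * D / T
    b = np.asarray(b, dtype=float)
    # np.i0 is the modified Bessel function of the first kind, order zero.
    return (b / var) * np.exp(-(b**2 + A**2) / (2.0 * var)) * np.i0(b * A / var)

T = 10 * 2.0 * np.pi / 0.02           # ten periods at omega = 0.02
b = np.linspace(0.0, 0.5, 50001)
p_noise = amplitude_pdf(b, 0.0, 0.5, T)      # no signal
p_signal = amplitude_pdf(b, 0.006, 0.5, T)   # with signal, A = 0.006
```

For larger arguments of the Bessel function an exponentially scaled variant would be needed to avoid overflow; the parameter values used here stay far from that regime.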

Fig. 2. (color online) Probability computed by Eq. (6) (continuous curves) and by numerical realizations of Eq. (3) (dots) for the stochastic input I(t) with (, blue curve and dots) and without (, red curve and dots) signal. The task is to answer whether a given received input data of length T includes signal . realizations are used for averages of numerical dots. (a) . The two distributions, (red) and (blue), strongly overlap. It is difficult to make a correct prediction. (b) The same as panel (a) with . With longer T the two distributions of and are separated considerably. The probability of reaching a correct conclusion becomes higher than in panel (a). In both panels (a) and (b) the theoretical predictions of Eq. (6) and the numerical results coincide with each other very well.

3. Detection of weak and short-lived signal from strong noise by using resonant filters

Now we inject the received data I(t) into a resonant filter[16]

According to Langevin theory, the transient solution of the system can be written as
where
with
In the above expressions, is the initial condition, is the stochastic input with signal, and γ is the damping coefficient of the linear filter system. The white noise, in terms of stochastic differential equations, corresponds to the Stratonovich formalism.[16]

Since this equation is exactly solvable, it may seem that further analysis offers nothing of interest. For T → ∞ and any initial condition, the system evolves to an asymptotic state in which the initial condition is forgotten, and the SNR of the output x(t) is exactly the same as the SNR of I(t) owing to the linearity of the dynamics. For finite T, however, the situation is much more complicated. The system is still solvable, but the solution depends in a strongly nonlinear way on the initial condition, and noise, signal, and all system parameters interact in a very complex manner. These interactions produce some interesting results.
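To make the transient behaviour concrete, the following Python sketch integrates a damped, driven linear oscillator. The specific form x'' + γx' + ω0²x = I(t), the semi-implicit Euler scheme, and the function name are assumptions made for illustration; the filter equation (8) itself is not reproduced here and may differ in detail.

```python
import numpy as np

def run_filter(drive, dt, omega0, gamma, x0=0.0, v0=0.0):
    """Integrate x'' + gamma x' + omega0^2 x = I(t) (assumed filter form).

    drive holds samples of I(t) with step dt; x0 and v0 are the prepared
    initial position and velocity. Returns arrays x, v of length len(drive) + 1.
    """
    x = np.empty(drive.size + 1)
    v = np.empty(drive.size + 1)
    x[0], v[0] = x0, v0
    for k in range(drive.size):
        # Semi-implicit Euler: update the velocity first, then advance the
        # position with the new velocity.
        v[k + 1] = v[k] + dt * (drive[k] - gamma * v[k] - omega0**2 * x[k])
        x[k + 1] = x[k] + dt * v[k + 1]
    return x, v

# Free decay over one period for omega0 = 0.02, from a prepared state x0 = 1.
x_free, v_free = run_filter(np.zeros(3142), dt=0.1, omega0=0.02, gamma=0.01, x0=1.0)
```

If the drive contains discretized white noise whose samples have variance 2D/dt, the product dt × drive has variance 2D dt, so the same loop acts as an Euler–Maruyama step for additive noise. Over a finite record the output still carries the memory of (x0, v0), which is the dependence on the initial preparation discussed above.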

The signal can be extracted from the output x(t) for finite time sequences of length NT0. In numerical computations, similarly to the case of I(t), B is given by

The numerically computed distributions with and without signal are shown as dots in Fig. 4. Compared with those of the inputs in Fig. 2, the two distribution curves appear to be more separated from each other, so the filter does make a difference.

Fig. 4. (color online) The same as Fig. 2, but using the formulas of Eqs. (5), (11), and (14) (continuous curves) and the numerical results of Eq. (10) (dots). Parameters of the filter Eq. (8) are: , , and . The input I(t) in Fig. 2 is replaced by the output of Eq. (8).

The probability distribution of B for the output can be derived theoretically, as for the input, although the derivation is much more tedious and the results are not explicit. The exact analytic formula of Eq. (5), with different mean values and variances, can be obtained. From the deterministic part of Eq. (9), and , the mean values of B1 and B2 can be obtained as

The variances of B1, B2, and their covariance define σ1, σ2, and σ12. All of them depend on the random term only, and they read
with , , and , where
Equations (12) can thus be expressed in a unified form with
Finally, the probability distribution for finite time sequences of length NT0 can be obtained analytically, though implicitly, by inserting the solutions (11) and (14) into Eq. (5).

In Eqs. (10)–(14), all the controllable quantities of the filter, including the parameter γ [related to the quality factor of the filter, which also takes effect through (see Eq. (9))] and the initial values of the filter state, interact with the input characteristics NT0 and ω in a strongly nonlinear manner and effectively influence the probability distribution. The central task of the present paper, shown schematically in Fig. 3, is to determine how these factors alter the resulting probability distributions and whether weak-signal detection with the resonant filter can outperform the algorithm of Eq. (5) applied to the input.

Fig. 3. (color online) Schematic figures of signal detection tasks. (a) Direct signal detection from the input I(t). (b) Signal detection from the output of a selected linear filter driven by the input I(t). The central problem is: whether the latter analysis can out-perform the former signal detection directly from the input.

With the two different inputs I(t) of Figs. 2(a) and 2(b), we compute the outputs of filter (8) for certain optimal parameters and show the results in Figs. 4(a) and 4(b). Several features of the results are interesting. First, the theoretical predictions of Eqs. (5), (11), and (14) coincide very well with the numerical results. Second, the data for the output in Figs. 4(a) and 4(b) appear better than those for I(t) in Figs. 2(a) and 2(b), in that the two compared distributions are more widely separated. In particular, for rather short T, the distributions without signal (A = 0) and with signal in Fig. 4(a) are separated more than those in Fig. 2(a).

4. Enhancing the detection efficiency for weak, short-duration signals by adjusting the parameters of resonant filters

It should be emphasized that, with the available information, we can draw the two distributions accurately, but we cannot control the value of B in each realization and cannot conclude with certainty whether a given B corresponds to signal or to no signal. The most probable way to judge from a computed B, using Eq. (6), whether I(t) includes the signal is: yes (blue) where the with-signal probability density is the larger one, and no (red) where the no-signal density is the larger one. To measure quantitatively the efficiency of signal extraction with the B values computed in Eqs. (3) and (10), we define a separation of the probability distributions of data with and without signal, based on the various areas in Fig. 5(a). Taking the two curves in Fig. 2(b) as an example, the two distributions have equal probability density at a crossing point (dotted line in Fig. 5(a)). The with-signal density is the larger one above this point, and the no-signal density is the larger one below it. Therefore, "signal received" is the more likely estimate for any B above the crossing point, and "no signal received" for any B below it. The percentages of correct answers for these probabilistic estimations read

Here the first quantity is the probability of successful signal extraction and the second is the probability of successful identification of no signal. In both cases, wrong estimations come from the overlapping area of the two distributions (with and without signal). We can thus simply define the Separation of Probability Distributions (SPD) of the two cases
as the measure of the overall probability of correct signal (or no signal) detection. Larger SPD represents better signal extraction.
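The decision rule and the role of the overlap area can be sketched numerically as follows. The function takes two densities sampled on a common grid, applies the pointwise rule "choose the case with the larger density", and returns the two success probabilities together with the overlap area. Since the success probabilities sum to 2 minus the overlap, one candidate separation measure is 1 − overlap; whether this matches the normalization of Eq. (16) is an assumption of this sketch, and the function name is hypothetical.

```python
import numpy as np

def separation(b, p_noise, p_signal):
    """Return (P_signal_ok, P_noise_ok, overlap) for densities on a uniform grid b."""
    db = b[1] - b[0]
    say_signal = p_signal > p_noise             # pointwise larger-density decision
    p_s = np.sum(p_signal[say_signal]) * db     # signal present and detected
    p_n = np.sum(p_noise[~say_signal]) * db     # signal absent and rejected
    overlap = np.sum(np.minimum(p_noise, p_signal)) * db
    return p_s, p_n, overlap

# Toy check with two unit-variance Gaussians centred at 0 and 3.
grid = np.linspace(-10.0, 20.0, 30001)
gauss = lambda m: np.exp(-(grid - m) ** 2 / 2.0) / np.sqrt(2.0 * np.pi)
p_s, p_n, overlap = separation(grid, gauss(0.0), gauss(3.0))
candidate_spd = 1.0 - overlap                   # one possible separation measure
```

Identical distributions give an overlap of 1 and a vanishing separation, whereas well-separated distributions give an overlap near 0.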

Fig. 5. (color online) Definition and computation of the separation of probability distributions (SPD) with signal from that without signal. (a) Schematic of signal prediction and the definition of SPD. At we have , and for , respectively. We thus most probably predict signal when while no signal for . The overall probability of correct predictions (including correct predictions of signal and of no signal) reads . (b) SPDI for the input I(t) of Eqs. (3) and (6) (red) and SPDx for the output of Eqs. (10) and (5), (11), (14) (blue) plotted against T. SPD is obtained by rigorous theoretical derivations (continuous curves) and numerical computations (dots). In particular, the ratio SPDx/SPDI is considerably larger than 1 for small T.

Equation (16) can be computed analytically for both the input I(t) (Eq. (2)) and the output x(t) (Eq. (8)), and can also be computed numerically from many noise realizations. In Fig. 5(b) we plot the SPDs for I(t) and x(t) against T, with other parameters fixed as , , and . A striking observation is that the SPD values of the output are considerably higher than those of the input I(t), and for small T the former is several times higher than the latter. This is the first time that a rigorous analysis has shown the possibility of enhancing the efficiency of signal detection. All the theoretical results (solid lines) are perfectly confirmed by numerical computations (dots).

In Fig. 6 we show the correct and incorrect conclusions of signal detection for I(t) with 200 and 2000 realizations: 100 and 1000 for A = 0, and 100 and 1000 for A = 0.006. In Fig. 7 we do exactly the same as in Fig. 6 with the same I(t), but consider the outputs x(t) of the filter Eq. (8). With properly chosen parameters we obtain better detection results: clearly more realizations move to their correct regions than move in the opposite direction.
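A counting experiment of this kind can be sketched for the raw input as follows. The script draws independent noise realizations, computes the Fourier amplitude of each, and counts how many fall on the correct side of the threshold at which the with-signal and no-signal densities are equal. The 2/T normalization, the noise convention ⟨Γ(t)Γ(t′)⟩ = 2Dδ(t − t′), the zero signal phase, the Rice and Rayleigh forms used for the threshold, and all names are assumptions of this sketch rather than the exact procedure behind Fig. 6.

```python
import numpy as np

def count_correct(A, omega, D, n_periods, n_real, dt=1.0, seed=0):
    """Monte Carlo count of correct yes/no decisions on the raw input I(t)."""
    rng = np.random.default_rng(seed)
    n = int(round(n_periods * 2.0 * np.pi / omega / dt))
    t = np.arange(n) * dt
    T = n * dt
    var = 4.0 * D / T            # variance of each quadrature (assumed convention)

    # Threshold: amplitude where the Rice (signal) and Rayleigh (no signal)
    # densities are equal, located on a grid. np.i0 would overflow for very
    # large A**2 / var; the values used here are far from that regime.
    grid = np.linspace(1e-6, A + 10.0 * np.sqrt(var), 20001)
    log_ratio = -A**2 / (2.0 * var) + np.log(np.i0(grid * A / var))
    b_c = grid[np.argmax(log_ratio > 0.0)]

    def amplitudes(amp):
        # Each noise sample has variance 2 D / dt (discretized white noise).
        noise = rng.normal(0.0, np.sqrt(2.0 * D / dt), size=(n_real, n))
        data = amp * np.sin(omega * t) + noise
        b1 = (data @ np.cos(omega * t)) * (2.0 * dt / T)
        b2 = (data @ np.sin(omega * t)) * (2.0 * dt / T)
        return np.hypot(b1, b2)

    ok_noise = int(np.sum(amplitudes(0.0) <= b_c))   # correctly answered "no"
    ok_signal = int(np.sum(amplitudes(A) > b_c))     # correctly answered "yes"
    return ok_noise, ok_signal, b_c

# Hypothetical run: 100 realizations each, ten periods of omega = 0.02.
ok_noise, ok_signal, b_c = count_correct(0.006, 0.02, 0.5, n_periods=10, n_real=100)
```

Repeating the same count on the filter output instead of I(t) would require the output distributions of Eqs. (5), (11), and (14) to set the threshold, which this sketch does not attempt.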

Fig. 6. (color online) Signal predictions of 100 [(a) and (b)] and 1000 [(c) and (d)] realizations by numerically computing Eq. (3) for I(t) without signal (A = 0, (a) and (c)) and with signal (A = 0.006, (b) and (d)). D = 0.5, . Numbers of correct predictions are written in the corresponding legends.

Fig. 7. (color online) (a)–(d) The same as Figs. 6(a)–6(d), respectively, by numerically computing Eq. (10) for the output x(t). Parameters of the resonant filter are: , , and . In each case the detections with x(t) here yield more correct conclusions than those with I(t) in Fig. 6.

That the resonant filter does improve the signal detection is shown in Fig. 8, where we repeat the computations of Figs. 6(c), 6(d), 7(c), and 7(d) ten times. Strikingly, each time the overall shift of the results from I(t) to x(t) is toward considerable improvement. The improvement ratios are rather high for short T, as shown in Table 1.

Fig. 8. (color online) Bar charts of signal detection results for ten groups of experiments of Figs. 6(c) and 6(d) and Figs. 7(c) and 7(d). Each group performs 1000 realizations of numerical computations of Eq. (3) (the input: (a) for A = 0, (b) for A = 0.006) and Eq. (10) (the output: (c) for A = 0, (d) for A = 0.006). Numbers of correct (red) and erroneous (black) detections are shown. It is obvious that with the optimal resonant filter more realizations move to the correct side than to the wrong side (panels (e) and (f)).

Table 1. Dependence of the SPDs of I(t) and x(t), and their ratios, on the time length.

Various parameters such as γ, x0, and are included in distribution of Eqs. (5), (11), and (14). It is interesting to study how these additional physical quantities in the filter, beyond the input data I(t), influence the detection efficiency.

In Fig. 9(a) we plot SPDx against γ with other parameters fixed; the probability of correct signal detection increases as γ decreases and approaches saturation around . In Fig. 9(b) we plot SPDx versus the initial energy E () prepared in the filter; SPDx increases monotonically with E and saturates to a value that is again higher than that of the input. In Fig. 9(c) SPDx is plotted against the initial phase angle θ (). There exist certain optimal phases at which the SPD of the output is considerably better than that of the input I(t). All plots in Figs. 9 and 10 show convincingly that the controllable parameters of the filter detector are effectively involved in the signal extraction, and that an optimal match between these parameters can indeed perform this task better. Roughly speaking, smaller γ, larger E, and an optimal initial phase provide better signal detection. We emphasize again that all numerical simulations agree well with the theoretical statistical analyses of Eqs. (6) and (5), (11), (14).

Fig. 9. Dependence of signal detection efficiency on control parameters. D = 0.5, A = 0.006, ω = 0.02, . (a) SPDx plotted against γ with and . SPD for I(t) is drawn for comparisons. There is an optimal , providing the largest SPDx which is much higher than SPDI. (b) The same as panel (a) with SPDx plotted versus E0 with and . SPDx increases as E0 and saturates to a value . (c) The same as panel (a) with SPDx versus θ for . SPDx reaches its maximum at . The results of panels (a), (b), and (c) show convincingly that the additional parameters of the filter can dramatically influence the efficiency of signal detection, and enhance the efficiency considerably higher than the input at some optimal parameter sets. Panels (d), (e), and (f) represent ratios corresponding to panels (a), (b), and (c), respectively.
Fig. 10. The same as Fig. 9 except

5. Conclusion

In conclusion, we have studied the problem of signal detection, focusing on the simplest and most fundamental case of a sinusoidal signal contaminated by white noise. The main characteristics of the task are weak signal, strong noise, and finite length of received data. In such cases, signal detection can be described only probabilistically. Detectors based on linear resonant filters are designed to enhance the efficiency of signal detection. For input data of infinite length, the detection result is independent of all detector parameters, which cannot improve the detection efficiency owing to the linearity of the detector dynamics. For finite signal length, however, all these parameters, including the initial preparation of the detector state, can effectively change the detection results by interacting with the input signal and noise in a highly nonlinear manner. At certain optimal parameter matches, the designed method achieves a signal detection efficiency much higher than that obtained directly from the input. All the above results are derived analytically and rigorously, and numerical computations fully support the theoretical predictions.

In this work our analyses are based on a much simplified situation of noisy sinusoidal signals with known ω, T, D, and even A, and the problem is only to answer yes or no. Nevertheless, even in these simplest cases, whether signal detection can be improved beyond the standard algorithm of the input B has long been a subject of debate and conflicting claims, and so far no rigorous analysis has clearly answered the question. Here, we give this fundamental problem a definite and positive answer.

We have focused on extracting a sinusoidal signal from strong noise contamination, which is the simplest and most fundamental signal detection problem. The extension to more general and more difficult problems of extracting general periodic or non-periodic signals from a noise background is of practical significance. We expect to pursue these goals in future work.

With deeper analyses and further improvements that take more realistic factors into account, the results are expected to be applicable to practical signal detection problems in which weak signal, strong noise, and limited available data length are crucial.

References
[1] Green D M 1958 J. Acoust. Soc. Am. 30 904
[2] Martin R D and Schwartz S C 1971 IEEE Trans. Inform. Theory 17 50
[3] Wang Y, Xu G, Liang L and Jiang K 2015 Mech. Syst. Signal Process. 54–55 259
[4] Krasnenker V M 1980 Automation and Remote Control 41 640
[5] Jung P and Hänggi P 1991 Phys. Rev. A 44 8032
[6] Inchiosa M E and Bulsara A R 1996 Phys. Rev. E 53 R2021
[7] Duan F, Chapeau-Blondeau F and Abbott D 2013 Digit. Signal Process. 23 1585
[8] Siddagangaiah S, Li Y, Guo X, Chen X, Zhang Q, Yang K and Yang Y 2016 Entropy 18 101
[9] Hänggi P, Inchiosa M, Fogliatti D and Bulsara A R 2000 Phys. Rev. E 62 6155
[10] Casado-Pascual J, Denk C, Gómez-Ordóñez J, Morillo M and Hänggi P 2003 Phys. Rev. E 67 036109
[11] Jimenez-Aquino J and Romero-Bastida M 2011 Phys. Rev. E 84 011137
[12] Duan F, Chapeau-Blondeau F and Abbott D 2014 Phys. Rev. E 90 022134
[13] Khovanov I A 2008 Phys. Rev. E 77 011124
[14] Ou Z Y 2012 Phys. Rev. A 85 023815
[15] He L F, Cui Y Y, Zhang T Q, Zhang G and Song Y 2016 Chin. Phys. B 25 060501
[16] Risken H 1984 The Fokker–Planck Equation (Berlin: Springer-Verlag) pp. 63–110, 229–275